• Conference Object  

      Data parallel acceleration of decision support queries using Cell/BE and GPUs

      Trancoso, Pedro; Othonos, D.; Artemiou, A. (2009)
      Decision Support System (DSS) workloads are known to be among the most time-consuming database workloads, as they process large data sets. Traditionally, DSS queries have been accelerated using large-scale multiprocessors. ...
    • Article  

      Data-Driven Thread Execution on Heterogeneous Processors 

      Arandi, Samer; Matheou, George; Kyriacou, Costas; Evripidou, Paraskevas (2017)
      In this paper we report our experience in implementing and evaluating the Data-Driven Multithreading (DDM) model on a heterogeneous multi-core processor. DDM is a non-blocking multithreading model that decouples the ...
    • Conference Object  

      DDM-CMP: Data-driven multithreading on a chip multiprocessor 

      Stavrou, Kyriakos; Evripidou, Paraskevas; Trancoso, Pedro (2005)
      High-end microprocessors achieve their performance as a result of adding more features and therefore increasing their complexity. In this paper we present DDM-CMP, a Chip-Multiprocessor using the Data-Driven Multithreading ...
    • Conference Object  

      Disconnected quark loop contributions to nucleon observables using Nf = 2 twisted clover fermions at the physical value of the light quark mass 

      Abdel-Rehim, A.; Alexandrou, Constantia; Constantinou, Martha; Hadjiyiannakou, Kyriakos; Jansen, K.; Kallidonis, Christos; Koutsou, Giannis; Avilés-Casco, A. V. (Proceedings of Science (PoS), 2015)
      We compute the disconnected quark loop contributions entering the determination of nucleon observables, using an Nf = 2 ensemble of twisted mass fermions with a clover term at a pion mass mπ = 133 MeV. We employ exact ...
    • Conference Object  

      Energy efficient stream-based configurable architecture for embedded platforms 

      Pratas, F.; Tomas, P.; Trancoso, Pedro; Sousa, L. (2012)
      Reconfigurable hardware can be used as an energy and performance efficient co-processing solution to accelerate certain types of applications. To facilitate the design of hardware accelerators we have proposed a methodology ...
    • Article  

      Evaluation of disconnected quark loops for hadron structure using GPUs 

      Alexandrou, Constantia; Constantinou, Martha; Drach, V.; Hadjiyiannakou, Kyriakos; Jansen, K.; Koutsou, Giannis; Strelchenko, A.; Vaquero, A. (2014)
      A number of stochastic methods developed for the calculation of fermion loops are investigated and compared, in particular with respect to their efficiency when implemented on Graphics Processing Units (GPUs). We assess ...
    • Article  

      Evaluation of fermion loops applied to the calculation of the η′ mass and the nucleon scalar and electromagnetic form factors

      Alexandrou, Constantia; Hadjiyiannakou, Kyriakos; Koutsou, Giannis; O Cais, A.; Strelchenko, A. (2012)
      The exact evaluation of the disconnected diagram contributions to the flavor-singlet pseudo-scalar meson mass, the nucleon σ-term and the nucleon electromagnetic form factors is carried out utilizing GPGPU technology with ...
    • Conference Object  

      Exploring graphics processor performance for general purpose applications 

      Trancoso, Pedro; Charalambous, Maria (2005)
      Graphics processors are designed to perform many floating-point operations per second. Consequently, they are an attractive architecture for high-performance computing at a low cost. Nevertheless, it is still not very clear ...
    • Article  

      A Family of Resource-Bound Real-Time Process Algebras 

      Lee, I.; Philippou, Anna; Sokolsky, O. (2006)
      The Algebra of Communicating Shared Resources (ACSR) is a timed process algebra which extends classical process algebras with the notion of a resource. It takes the view that the timing behavior of a real-time system depends ...
    • Article  

      Fine-grain parallelism using multi-core, Cell/BE, and GPU systems 

      Pratas, F.; Trancoso, Pedro; Sousa, L.; Stamatakis, A.; Shi, G.; Kindratenko, V. (2012)
      Currently, we are facing a situation where applications exhibit increasing computational demands and where a large variety of parallel processor systems are available. In this paper we focus on exploiting fine-grain ...
    • Conference Object  

      Fine-grain parallelism using multi-core, Cell/BE, and GPU systems: Accelerating the phylogenetic likelihood function

      Pratas, F.; Trancoso, Pedro; Stamatakis, A.; Sousa, L. (2009)
      We are currently faced with the situation where applications have increasing computational demands and there is a wide selection of parallel processor systems. In this paper we focus on exploiting fine-grain parallelism ...
    • Article  

      A flexible personalization architecture for wireless Internet based on mobile agents 

      Samaras, George S.; Panayiotou, Christoforos (2002)
      The explosive growth of the Internet has fuelled the creation of new and exciting information services. Most of the current technology has been designed for desktop and larger computers with medium to high bandwidth and ...
    • Conference Object  

      How to compare the performance of two SMT microarchitectures 

      Sazeides, Yiannakis; Juan, T. (Institute of Electrical and Electronics Engineers Inc., 2001)
      In this paper we discuss methods and metrics for comparing the performance of two simultaneous multithreading microarchitectures. We identify conditions under which the instructions-per-cycle metric may be misleading for ...
    • Conference Object  

      Implicit-storing and redundant-encoding-of-attribute information in error-correction-codes 

      Sazeides, Yiannakis; Özer, E.; Kershaw, D.; Nikolaou, Panagiota; Kleanthous, Marios M.; Abella, J. (2013)
      This paper proposes implicit-storing to extend the logical capacity of a memory array without increasing its physical capacity by leveraging the array's error-correction-codes to infer the implicitly stored bits. ...
    • Article  

      Initial experiences porting a bioinformatics application to a graphics processor 

      Charalambous, Maria; Trancoso, Pedro; Stamatakis, A. (2005)
      Bioinformatics applications are among the most relevant and compute-demanding applications today. While normally these applications are executed on clusters or dedicated parallel systems, in this work we explore the use ...
    • Conference Object  

      Modeling program predictability 

      Sazeides, Yiannakis; Smith, James E. (IEEE Comp Soc, 1998)
      Basic properties of program predictability - for both values and control - are defined and studied. We take the view that program predictability originates at certain points during a program's execution, flows through ...
    • Conference Object  

      Modeling the implications of DRAM failures and protection techniques on datacenter TCO 

      Nikolaou, Panagiota; Sazeides, Yiannakis; Ndreu, L.; Kleanthous, Marios M. (IEEE Computer Society, 2015)
      Total Cost of Ownership (TCO) is a key optimization metric for the design of a datacenter. This paper proposes, for the first time, a framework for modeling the implications of DRAM failures and DRAM error protection ...
    • Conference Object  

      Modeling value speculation 

      Sazeides, Yiannakis (IEEE Computer Society, 2002)
      Several studies of speculative execution based on values have reported promising performance potential. However, virtually all microarchitectures in these studies were described in an ambiguous manner, mainly due to the ...
    • Conference Object  

      A parallel implementation of a multi-objective evolutionary algorithm 

      Kannas, Christos C.; Nicolaou, Christos A.; Pattichis, Constantinos S. (2009)
      Multi-objective Evolutionary Algorithms (MOEAs) have features that can be exploited to harness the processing power offered by modern multi-core CPUs. Modern programming languages offer the ability to use threads and ...
    • Conference Object  

      The performance vulnerability of architectural and non-architectural arrays to permanent faults 

      Hardy, D.; Sideris, I.; Ladas, N.; Sazeides, Yiannakis (2012)
      This paper presents a first-order analytical model for determining the performance degradation caused by permanently faulty cells in architectural and non-architectural arrays. We refer to this degradation as the performance ...